Scalable Active Temporal Constrained Clustering
نویسندگان
چکیده
We introduce a novel interactive framework to handle both instance-level and temporal smoothness constraints for clustering large temporal data. It consists of a constrained clustering algorithm which optimizes the clustering quality, constraint violation and the historical cost between consecutive data snapshots. At the center of our framework is a simple yet effective active learning technique for iteratively selecting the most informative pairs of objects to query users about, and updating the clustering with new constraints. Those constraints are then propagated inside each snapshot and between snapshots via constraint inheritance and propagation to further enhance the results. Experiments show better or comparable clustering results than existing techniques as well as high scalability for large datasets.
منابع مشابه
Active Semi-Supervision for Pairwise Constrained Clustering
Semi-supervised clustering uses a small amount of supervised data to aid unsupervised learning. One typical approach specifies a limited number of must-link and cannotlink constraints between pairs of examples. This paper presents a pairwise constrained clustering framework and a new method for actively selecting informative pairwise constraints to get improved clustering performance. The clust...
متن کاملRepeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
متن کاملSimple and Scalable Constrained Clustering: a Generalized Spectral Method
We present a simple spectral approach to the well-studied constrained clustering problem. It captures constrained clustering as a generalized eigenvalue problem with graph Laplacians. The algorithm works in nearly-linear time and provides concrete guarantees for the quality of the clusters, at least for the case of 2-way partitioning. In practice this translates to a very fast implementation th...
متن کاملConstrained Spectral Clustering under a Local Proximity Structure Assumption
This work focuses on incorporating pairwise constraints into a spectral clustering algorithm. A new constrained spectral clustering method is proposed, as well as an active constraint acquisition technique and a heuristic for parameter selection. We demonstrate that our constrained spectral clustering method, CSC, works well when the data exhibits what we term local proximity structure. Empiric...
متن کاملActive Constrained Clustering by Examining Spectral Eigenvectors
This work focuses on the active selection of pairwise constraints for spectral clustering. We develop and analyze a technique for Active Constrained Clustering by Examining Spectral eigenvectorS (ACCESS) derived from a similarity matrix. The ACCESS method uses an analysis based on the theoretical properties of spectral decomposition to identify data items that are likely to be located on the bo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2018